The best defense is a good offense: Countering black box attacks by predicting slightly wrong labels
نویسندگان
چکیده
Black-Box attacks on machine learning models occur when an attacker, despite having no access to the inner workings of a model, can successfully craft an attack by means of model theft. The attacker will train an own substitute model that mimics the model to be attacked. The substitute can then be used to design attacks against the original model, for example by means of adversarial samples. We put ourselves in the shoes of the defender and present a method that can successfully avoid model theft by mounting a counter-attack. Specifically, to any incoming query, we slightly perturb our output label distribution in a way that makes substitute training infeasible. We demonstrate that the perturbation does not affect the ordinary use of our model, but results in an effective defense against attacks based on model theft.
منابع مشابه
Countering Adversarial Images using Input Transformations
This paper investigates strategies that defend against adversarial-example attacks on image-classification systems by transforming the inputs before feeding them to the system. Specifically, we study applying image transformations such as bit-depth reduction, JPEG compression, total variance minimization, and image quilting before feeding the image to a convolutional network classifier. Our exp...
متن کاملCountering Adversarial Images
This paper investigates strategies that defend against adversarial-example attacks on image-classification systems by transforming the inputs before feeding them to the system. Specifically, we study applying image transformations such as bit-depth reduction, JPEG compression, total variance minimization, and image quilting before feeding the image to a convolutional network classifier. Our exp...
متن کاملCountering Adversarial Images
This paper investigates strategies that defend against adversarial-example attacks on image-classification systems by transforming the inputs before feeding them to the system. Specifically, we study applying image transformations such as bit-depth reduction, JPEG compression, total variance minimization, and image quilting before feeding the image to a convolutional network classifier. Our exp...
متن کاملCountering Adversarial Images
This paper investigates strategies that defend against adversarial-example attacks on image-classification systems by transforming the inputs before feeding them to the system. Specifically, we study applying image transformations such as bit-depth reduction, JPEG compression, total variance minimization, and image quilting before feeding the image to a convolutional network classifier. Our exp...
متن کاملNew Fixed Point Attacks on GOST2 Block Cipher
GOST block cipher designed in the 1970s and published in 1989 as the Soviet and Russian standard GOST 28147-89. In order to enhance the security of GOST block cipher after proposing various attacks on it, designers published a modified version of GOST, namely GOST2, in 2015 which has a new key schedule and explicit choice for S-boxes. In this paper, by using three exactly identical portions of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1711.05475 شماره
صفحات -
تاریخ انتشار 2017